How to Deploy a Deep Learning Application Developed in MATLAB to the NVIDIA Jetson Xavier NX
Address: https://www.zhihu.com/people/lao-xiu-60
Suppose I have a MATLAB example that grabs live frames from a camera and classifies them. How do I get it running on the NVIDIA Jetson Xavier NX?
01
02
sudo apt-get install libsdl1.2-dev v4l-utils sox libsox-fmt-all libsox-dev
PATH="/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
LD_LIBRARY_PATH="/usr/local/cuda/lib64/"
3. OpenCV
03
CUDA Version : 10.2
cuDNN Version : 8.0
TensorRT Version : 7.1
OpenCV Version : 4.1.1
04
>> hwobj= jetson('jetson-host','user','password')
Checking for CUDA availability on the Target...
Checking for 'nvcc' in the target system path...
Checking for cuDNN library availability on the Target...
Checking for TensorRT library availability on the Target...
Checking for prerequisite libraries is complete.
Gathering hardware details...
Checking for third-party library availability on the Target...
Gathering hardware details is complete.
Board name : NVIDIA Jetson AGX Xavier
CUDA Version : 10.2
cuDNN Version : 8.0
TensorRT Version : 7.1
GStreamer Version : 1.14.5
V4L2 Version : 1.14.2-1
SDL Version : 1.2
OpenCV Version : 4.1.1
Available Webcams :
Available GPUs : Xavier
hwobj =
jetson with properties:
DeviceAddress: 'sha-xaviernx'
Port: 22
BoardName: 'NVIDIA Jetson AGX Xavier'
CUDAVersion: '10.2'
cuDNNVersion: '8.0'
TensorRTVersion: '7.1'
SDLVersion: '1.2'
V4L2Version: '1.14.2-1'
GStreamerVersion: '1.14.5'
OpenCVVersion: '4.1.1'
GPUInfo: [1×1 struct]
WebcamList: []
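The connection log above reports no available webcams (WebcamList is empty), while the detection function later in this article opens one. Below is a minimal sketch of a check to run before deploying. It assumes hwobj is the connection object created above and that the support package provides updatePeripheralInfo for re-scanning devices on the target. Treat it as an illustration, not a tested procedure.

```matlab
% Assumes hwobj is the jetson connection object created above.
if isempty(hwobj.WebcamList)
    % Assumption: a USB camera was plugged in after connecting,
    % so ask the target to re-scan its peripherals.
    updatePeripheralInfo(hwobj);
end

if isempty(hwobj.WebcamList)
    warning('No webcam detected on the target; webcam(hwobj, ...) will likely fail.');
else
    disp(hwobj.WebcamList);   % list the cameras the target can see
end
```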
>> envCfg = coder.gpuEnvConfig('jetson');
envCfg.DeepLibTarget = 'cudnn';
envCfg.DeepCodegen = 1;
envCfg.Quiet = 0;
envCfg.HardwareObject = hwobj;
coder.checkGpuInstall(envCfg)
Compatible GPU : PASSED
CUDA Environment : PASSED
Runtime : PASSED
cuFFT : PASSED
cuSOLVER : PASSED
cuBLAS : PASSED
cuDNN Environment : PASSED
Deep Learning (cuDNN) Code Generation: PASSED
ans =
struct with fields:
gpu: 1
cuda: 1
cudnn: 1
tensorrt: 0
basiccodegen: 0
basiccodeexec: 0
deepcodegen: 1
deepcodeexec: 0
tensorrtdatatype: 0
profiling: 0
>> envCfg = coder.gpuEnvConfig('jetson');
envCfg.DeepLibTarget = 'tensorrt';
envCfg.DeepCodegen = 1;
envCfg.Quiet = 0;
envCfg.HardwareObject = hwobj;
coder.checkGpuInstall(envCfg)
Compatible GPU : PASSED
CUDA Environment : PASSED
Runtime : PASSED
cuFFT : PASSED
cuSOLVER : PASSED
cuBLAS : PASSED
cuDNN Environment : PASSED
TensorRT Environment : PASSED (Warning: Deep learning code generation has been tested with TensorRT v7.2. The provided TensorRT library v7.1 may not be fully compatible.)
Deep Learning (TensorRT) Code Generation: PASSED
ans =
struct with fields:
gpu: 1
cuda: 1
cudnn: 1
tensorrt: 1
basiccodegen: 0
basiccodeexec: 0
deepcodegen: 1
deepcodeexec: 0
tensorrtdatatype: 1
profiling: 0
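The two checks above each return a struct of pass/fail flags, which can drive the choice of deep learning library instead of being read by eye. The sketch below uses hand-written stand-in structs that copy a few fields from the output above. On real hardware you would capture the return value of coder.checkGpuInstall(envCfg) instead. The variable names cudnnResults, tensorrtResults and deepLib are illustrative only.

```matlab
% Stand-ins for the structs printed above (only the fields used here).
% On real hardware: results = coder.checkGpuInstall(envCfg);
cudnnResults    = struct('gpu', 1, 'cuda', 1, 'cudnn', 1, 'tensorrt', 0, 'deepcodegen', 1);
tensorrtResults = struct('gpu', 1, 'cuda', 1, 'cudnn', 1, 'tensorrt', 1, 'deepcodegen', 1);

% Prefer TensorRT when its check passed, otherwise fall back to cuDNN.
if tensorrtResults.tensorrt && tensorrtResults.deepcodegen
    deepLib = 'tensorrt';
elseif cudnnResults.cudnn && cudnnResults.deepcodegen
    deepLib = 'cudnn';
else
    error('Neither TensorRT nor cuDNN code generation passed the check.');
end

fprintf('Selected deep learning library: %s\n', deepLib);
```

The resulting string could then be passed to coder.DeepLearningConfig when the code generation configuration is built.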
05
%% Object Detection Using YOLO v3 608x608
function out = yolov3_detection()
    % Update buildinfo with the OpenCV library flags.
    %opencv_link_flags = '`pkg-config --cflags --libs opencv`'; % opencv 3
    opencv_link_flags = '`pkg-config --cflags --libs opencv4`'; % opencv 4
    coder.updateBuildInfo('addLinkFlags',opencv_link_flags);
    %coder.inline('never');

    % Connect to webcam
    hwobj = jetson;
    wcam = webcam(hwobj, 1, '1280x720');
    img_w = 1280;
    img_h = 720;
    player = imageDisplay(hwobj);

    % Grab and show one frame before entering the loop
    orgImg = snapshot(wcam);
    image(player, orgImg);

    % Letterbox geometry: scale the frame to fit inside an
    % imgSize-by-imgSize network input while keeping its aspect ratio
    imgSize = 608;
    out = zeros([img_h img_w 3], 'uint8');
    ratio = min(imgSize/img_w, imgSize/img_h);
    % Image height and width after resizing image
    w = round(img_w * ratio);
    h = round(img_h * ratio);
    % First row and column of the resized frame inside the square input
    st_h = round((imgSize - h)/2) + 1;
    st_w = round((imgSize - w)/2) + 1;
    fps = 0;

    while true
        orgImg = snapshot(wcam);
        orgImg = fliplr(orgImg);   % mirror the frame horizontally
        in = im2single(orgImg);
        % img = imadjust(img, stretchlim(img,[0.01,0.80]));
        % img = histeq(img);

        % Create a mid-gray background and paste the resized frame into it
        in3 = ones(imgSize, imgSize, 3, 'like', in) * 0.5;
        in2 = imresize(in, [h, w]); %,'Method','bilinear','AntiAliasing',false);
        in3(st_h:st_h+h-1, st_w:st_w+w-1, :) = in2;

        % Time the inference and update the smoothed FPS estimate
        % (yolov3_detect and postProcess are helper functions not shown in this article)
        tic;
        predictions = yolov3_detect(in3);
        elapsedTime = toc;
        fps = .9*fps + .1*(1/elapsedTime);

        % Post-process and display the results
        out = postProcess(predictions, orgImg, w, h);
        out = insertText(out, [1, 1], sprintf('FPS %2.2f', fps), 'FontSize', 26, 'BoxColor', [0,150,0]);
        out = imresize(out, [img_h img_w]);
        image(player, out);
    end
end
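The letterbox arithmetic and the FPS smoothing in the function above are easier to follow with concrete numbers. The sketch below replays both on the desktop with the same constants (1280x720 frame, 608x608 network input). The fixed 50 ms inference time is a made-up value for illustration. For these inputs the resized frame comes out as 608 wide by 342 high and occupies rows 134 to 475 of the square canvas, leaving 133 gray rows above and below.

```matlab
% Constants copied from yolov3_detection above.
img_w = 1280;
img_h = 720;
imgSize = 608;

% Scale factor that fits the frame inside the square network input.
ratio = min(imgSize/img_w, imgSize/img_h);
w = round(img_w * ratio);
h = round(img_h * ratio);

% First row and column (1-based) of the resized frame inside the canvas.
st_h = round((imgSize - h)/2) + 1;
st_w = round((imgSize - w)/2) + 1;

fprintf('Resized frame: %d wide, %d high\n', w, h);
fprintf('Canvas rows %d:%d, columns %d:%d\n', st_h, st_h+h-1, st_w, st_w+w-1);

% FPS smoothing: an exponential moving average that keeps 90% of the
% previous estimate and mixes in 10% of the newest measurement.
fps = 0;
elapsedTime = 0.05;          % hypothetical constant inference time (50 ms)
for k = 1:10
    fps = .9*fps + .1*(1/elapsedTime);
end
fprintf('Smoothed FPS after 10 frames: %.2f\n', fps);
```

Because the estimate starts at 0, the displayed FPS climbs toward the true rate over the first few dozen frames rather than being correct immediately.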
06
%% connect hardware
hwobj = jetson('host-name','user','password');
%% Generate CUDA Code for the Target Using GPU Coder
% To generate a CUDA executable that can be deployed onto an NVIDIA
% target, create a GPU code configuration object for generating an executable.
cfg = coder.gpuConfig('exe');
cfg.GenerateReport = true;
cfg.Hardware = coder.hardware('NVIDIA Jetson');
cfg.DeepLearningConfig = coder.DeepLearningConfig('tensorrt');
cfg.DeepLearningConfig.DataType = 'fp16';
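% Xavier-class GPUs report compute capability 7.2, so '7.2' may be a closer match than the '7.0' used here.
cfg.GpuConfig.ComputeCapability = '7.0';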
cfg.Hardware.BuildDir = '~/remoteBuildDir';
cfg.GpuConfig.SelectCudaDevice = 0;
cfg.GenerateExampleMain = 'GenerateCodeAndCompile';
codegen('-config', cfg, 'yolov3_detection', '-report')
%% Run the YOLO v3 Detection on the Target
% Run the generated executable on the target.
pid = hwobj.runApplication('yolov3_detection');
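Since the detection function loops with while true, the deployed executable does not exit on its own. Below is a minimal sketch of stopping it from the MATLAB session. It assumes hwobj is still connected and that the executable keeps the name used above. The 30-second pause is an arbitrary illustrative value.

```matlab
% Assumes hwobj is the jetson connection object and that
% 'yolov3_detection' was started with runApplication as above.
pause(30);                                   % let the demo run for a while (arbitrary)
killApplication(hwobj, 'yolov3_detection');  % stop the executable on the target
```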